Using R to Orchestrate APIs

A presentation for Research Data at the Edge, Day One of Duke Research Computing Symposium

Hosted by the Data & Visualization Services Department.

Presentation materials composed in R Markdown using RStudio, stored in a GitHub repository, and served via GitHub Pages.

Outline

Why?

The Web has lots of stuff

  • frontier beyond curated datasets
  • stuff is wrapped in HTML
  • HTML is transported over HTTP but composed for human-to-machine (h2m) consumption
  • Intellectual Property rights bear serious consideration

API

Application Programming Interface

  • Built for machine-to-machine interactions
  • Instructions for programs

Client / Server

  • Make [R] interface with the web
  • Same as h2m but now m2m

Human Simulation

A dramatization…

  • Person uses Web Client
    • Person enters a URL
    • Client & server negotiate (dramatization: good handshake)
    • Information is sent back wrapped in HTML
    • Web browser parses the HTML

m2m – development

dramatization: confused about the protocol

JSON

# from https://en.wikipedia.org/wiki/JSON
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}
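
As a minimal sketch of how this structure maps onto R, the snippet below parses a shortened copy of the example with jsonlite's fromJSON(). The variable names (person_json, person) are only illustrative. With jsonlite's default simplification, a JSON object becomes a named list and an array of objects becomes a data frame.

```r
library(jsonlite)

# A shortened copy of the Wikipedia example, held in an R string
person_json <- '{
  "firstName": "John",
  "isAlive": true,
  "age": 25,
  "address": {"city": "New York", "state": "NY"},
  "phoneNumbers": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "office", "number": "646 555-4567"}
  ],
  "children": [],
  "spouse": null
}'

# fromJSON() accepts a JSON string as well as a URL or file path
person <- fromJSON(person_json)

person$firstName          # JSON string      -> character
person$age                # JSON number      -> numeric
person$address$city       # nested object    -> nested list
person$phoneNumbers       # array of objects -> data frame
person$phoneNumbers$type  # one column of that data frame
```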

Example

To Follow Along

  1. Open an RStudio Docker Container - https://vm-manage.oit.duke.edu/containers/rstudio
  2. Project > New Project
  3. Version Control > Git
  4. Repository URL = https://github.com/libjohn/r-api-json.git > Create Project
  5. Go to line 150-ish (“### Demonstration”) of the API-JSON-SYMPOSIUM.Rmd file

Demonstration

library(jsonlite)
# https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html
# for building tibbles
library(tidyverse)

Single JSON object

When the server response is a single JSON object, jsonlite makes viewing the data pretty simple: fromJSON() returns it as a named list.

# Note: OMDb now requires an API key; append &apikey=YOUR_KEY to OMDb URLs such as this one.
oneJSONresult <- fromJSON("http://www.omdbapi.com/?t=rocky&y=&plot=full&r=json")

Let’s see the results in the next slide


oneJSONresult
$Title
[1] "Rocky"

$Year
[1] "1976"

$Rated
[1] "PG"

$Released
[1] "03 Dec 1976"

$Runtime
[1] "120 min"

$Genre
[1] "Drama, Sport"

$Director
[1] "John G. Avildsen"

$Writer
[1] "Sylvester Stallone"

$Actors
[1] "Sylvester Stallone, Talia Shire, Burt Young, Carl Weathers"

$Plot
[1] "Rocky Balboa is a struggling boxer trying to make the big time, working as a debt collector for a pittance. When heavyweight champion Apollo Creed visits Philadelphia, his managers want to set up an exhibition match between Creed and a struggling boxer, touting the fight as a chance for a \"nobody\" to become a \"somebody\". The match is supposed to be easily won by Creed, but someone forgot to tell Rocky, who sees this as his only shot at the big time."

$Language
[1] "English"

$Country
[1] "USA"

$Awards
[1] "Won 3 Oscars. Another 16 wins & 21 nominations."

$Poster
[1] "https://images-na.ssl-images-amazon.com/images/M/MV5BMTY5MDMzODUyOF5BMl5BanBnXkFtZTcwMTQ3NTMyNA@@._V1_SX300.jpg"

$Metascore
[1] "N/A"

$imdbRating
[1] "8.1"

$imdbVotes
[1] "387,927"

$imdbID
[1] "tt0075148"

$Type
[1] "movie"

$Response
[1] "True"

The resulting list behaves as you would expect in R.
  • You can list all the element names.
names(oneJSONresult)
 [1] "Title"      "Year"       "Rated"      "Released"   "Runtime"    "Genre"      "Director"   "Writer"     "Actors"    
[10] "Plot"       "Language"   "Country"    "Awards"     "Poster"     "Metascore"  "imdbRating" "imdbVotes"  "imdbID"    
[19] "Type"       "Response"  
  • Or extract an individual element by name.
oneJSONresult$Title
[1] "Rocky"
oneJSONresult$Awards
[1] "Won 3 Oscars. Another 16 wins & 21 nominations."
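
Every value in the output above is a character string, including the rating and the vote count. As a small sketch, the fields can be converted to numbers and gathered into a one-row data frame. It uses a hand-copied subset of the result so it runs without a network call; the movie list and the new column names are illustrative and not part of the original notebook.

```r
# Stand-in for oneJSONresult: a few fields copied from the output above
movie <- list(
  Title      = "Rocky",
  Year       = "1976",
  Runtime    = "120 min",
  imdbRating = "8.1",
  imdbVotes  = "387,927"
)

# Convert the character fields to numbers before doing any arithmetic
movie_df <- data.frame(
  title       = movie$Title,
  year        = as.integer(movie$Year),
  runtime_min = as.integer(sub(" min", "", movie$Runtime, fixed = TRUE)),
  rating      = as.numeric(movie$imdbRating),
  votes       = as.integer(gsub(",", "", movie$imdbVotes, fixed = TRUE)),
  stringsAsFactors = FALSE
)

# Inspect the column types of the result
str(movie_df)
```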

A JSON Matrix

The output of this code snippet renders differently in the console, the Notebook script output, and the Notebook HTML output. In the Notebook script output you can find the component name, in this case $Search. Alternatively, you can use bracket notation: [[1]]. Once you have identified the component name, it’s easier to identify the element names.

jsonSeriesResutlsMatrix <- fromJSON("http://www.omdbapi.com/?s=rocky&type=series&r=json&page=1")
jsonSeriesResutlsMatrix
$Search

$totalResults
[1] "20"

$Response
[1] "True"

Extract the search results. fromJSON() has already simplified the JSON array of records into a data frame.

jsonSeriesResutlsMatrix$Search

jsonSeriesResutlsMatrix$Search$Title
 [1] "Rocky and His Friends"         "Dr. Jeff: Rocky Mountain Vet"  "Rocky Jones, Space Ranger"     "Rocky Mountain Law"           
 [5] "Rocky King, Detective"         "Rocky Road"                    "Rocky Mountain Bounty Hunters" "Rocky + Drago"                
 [9] "Rocky Point"                   "Rocky Star"                   
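
The search response reports 20 total results, but only 10 titles are listed above, which suggests the API returns results one page at a time. The following sketch builds the URL for each page without fetching anything. It assumes 10 results per page (inferred from the output above, not confirmed against the API documentation), and omdb_search_url() is a hypothetical helper, not part of jsonlite or the original notebook.

```r
# Hypothetical helper: build an OMDb search URL for one page of results
omdb_search_url <- function(term, page = 1, type = "series") {
  paste0(
    "http://www.omdbapi.com/",
    "?s=", utils::URLencode(term, reserved = TRUE),
    "&type=", type,
    "&r=json",
    "&page=", page
  )
}

total_results <- as.integer("20")  # $totalResults arrives as a string
per_page      <- 10                # assumption: inferred from the 10 titles shown
n_pages       <- ceiling(total_results / per_page)

# One URL per page of results
page_urls <- vapply(
  seq_len(n_pages),
  function(p) omdb_search_url("rocky", page = p),
  character(1)
)
page_urls
```

Each URL could then be passed to fromJSON() and the resulting $Search data frames combined with rbind().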

Resources
